A New Algorithm for Identifying Loops in Decompilation

Authors

  • Tao Wei
  • Jian Mao
  • Wei Zou
  • Yu Chen
Abstract

Loop identification is an essential step of control flow analysis in decompilation. The classical algorithm for identifying loops is Tarjan’s interval-finding algorithm, which is restricted to reducible graphs. Havlak presents an extension of Tarjan’s algorithm that handles irreducible graphs and constructs a loop-nesting forest for an arbitrary flow graph. There is evidence that the running time of this algorithm is quadratic in the worst case, not almost linear as claimed. Ramalingam presents an improved algorithm with low time complexity on arbitrary graphs, but it does not perform well on “real” control flow graphs (CFGs). We present a novel algorithm for identifying loops in arbitrary CFGs. Building on a more detailed study of the properties of loops and depth-first search (DFS), this algorithm traverses a CFG only once, in DFS order, and collects all the information it needs on the fly. It runs in approximately linear time and does not use any complicated data structures such as Interval/DSG or UNION-FIND sets. To analyze the complexity of the algorithm, we introduce a new concept, the unstructuredness coefficient, which describes the unstructuredness of CFGs, and we find that the unstructuredness coefficients of the executables we examined are usually small (<1.5). This “low-unstructuredness” property distinguishes these CFGs from general single-root connected directed graphs, and it offers an explanation of why existing algorithms do not perform well on real-world cases. The new algorithm has been applied to 11526 CFGs in 6 typical binary executables on both Linux and Windows platforms. The experimental results validate our theoretical analysis and show that our algorithm runs 2-5 times faster than the Havlak-Tarjan algorithm and 2-8 times faster than the Ramalingam-Havlak-Tarjan algorithm.
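
The abstract describes a single DFS pass over the CFG that gathers loop information on the fly. The sketch below is not the authors' algorithm; it is a deliberately simplified illustration of the underlying idea, assuming a CFG given as an adjacency dictionary. It marks a node as a loop header whenever a DFS edge points back to a node still on the current DFS path. It does not build a loop-nesting forest and does not detect re-entries into irreducible loops. The names `find_loop_headers` and `example_cfg` are hypothetical.

```python
from typing import Dict, List, Set


def find_loop_headers(cfg: Dict[str, List[str]], entry: str) -> Set[str]:
    """Return the nodes targeted by a DFS back edge (simplified loop headers)."""
    visited: Set[str] = set()   # nodes the DFS has already reached
    on_path: Set[str] = set()   # nodes on the current DFS path
    headers: Set[str] = set()

    def dfs(node: str) -> None:
        visited.add(node)
        on_path.add(node)
        for succ in cfg.get(node, []):
            if succ not in visited:
                dfs(succ)
            elif succ in on_path:
                # The edge returns to a node still on the DFS path,
                # so it closes a cycle: treat its target as a header.
                headers.add(succ)
        on_path.discard(node)

    dfs(entry)
    return headers


# Hypothetical CFG: an outer loop headed by B with an inner self-loop on C.
example_cfg = {
    "A": ["B"],
    "B": ["C", "E"],
    "C": ["C", "D"],
    "D": ["B"],
    "E": [],
}

print(sorted(find_loop_headers(example_cfg, "A")))
```

Each node and edge is visited once, so this simplified pass is linear in the size of the graph. The recursion is kept for readability; a very deep CFG would need an explicit stack to avoid Python's recursion limit.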

Similar resources

A Structuring Algorithm for Decompilation

This paper presents a structuring algorithm for arbitrary reducible, unstructured graphs. Graphs are structured into semantically equivalent graphs, without the need for code replication or the introduction of new variables. The algorithm makes use of structures such as if..then..elses, while, repeat and loop loops, and case statements. Gotos are only used when the graph cannot be structured with a...

Native x86 Decompilation using Semantics-Preserving Structural Analysis and Iterative Control-Flow Structuring

There are many security tools and techniques for finding bugs, but many of them assume access to source code. We propose leveraging decompilation, the study of recovering abstractions from binary code, as a technique for applying existing source-based tools and techniques to binary programs. A decompiler must have two properties to be used for security: it must (1) be correct (is the output fun...

Solving the tandem AGV network design problem using tabu search: Cases of maximum workload and workload balance with fixed and non-fixed number of loops

A tandem AGV configuration connects all cells of a manufacturing area by means of non-overlapping, single-vehicle closed loops. Each loop has at least one additional P/D station, provided as an interface between adjacent loops. This study describes the development of three tabu search algorithms for the design of tandem AGV systems. The first algorithm was developed based on the basic definiti...

A Threshold Accepting Algorithm for Partitioning Machines in a Tandem Automated Guided Vehicle

A tandem automated guided vehicle (AGV) system deals with grouping workstations into some non-overlapping zones, and assigning exactly one AGV to each zone. This paper presents a new non-linear integer mathematical model to group n machines into N loops that minimizes both inter and intra-loop flows simultaneously. Due to computational difficulties of exact methods in solving our pr...

Publication date: 2007